Fingerprinting Ratings for Collaborative Filtering - Theoretical and Empirical Analysis

Authors

  • Yoram Bachrach
  • Ralf Herbrich
Abstract

We consider fingerprinting methods for collaborative filtering (CF) systems. In general, CF systems show their real strength when supplied with enormous data sets. Earlier work already suggests sketching techniques for handling massive amounts of information, but most prior analysis has been limited to non-ranking application scenarios and has focused mainly on theory. We demonstrate how to use fingerprinting methods to compute a family of rank correlation coefficients. Our methods allow identifying users who have similar rankings over a certain set of items, a problem that lies at the heart of CF applications. We show that our method approximates rank correlations with high accuracy and confidence. We examine the suggested methods empirically through a recommender system for the Netflix dataset, showing that the required fingerprint sizes are even smaller than the theoretical analysis suggests. We also explore the use of standard hash functions rather than min-wise independent hashes, and the relation between the quality of the final recommendations and the fingerprint size.
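
To make the fingerprinting idea concrete, here is a minimal Python sketch of generic min-hash fingerprints. It is not the authors' rank correlation method, which the abstract does not spell out: it only estimates the Jaccard overlap between the sets of items two users have rated. The function names, the seeded BLAKE2b hash (a standard hash function standing in for a min-wise independent family), and the default fingerprint size `k` are all illustrative assumptions.

```python
import hashlib


def item_hash(seed: int, item: str) -> int:
    """Seeded 64-bit hash of an item ID, built on a standard hash (BLAKE2b).

    Assumption: a salted standard hash is used in place of a true
    min-wise independent hash family.
    """
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def fingerprint(items, k=256):
    """Return a k-entry fingerprint: the minimum hash of the set per seed.

    `items` must be a non-empty collection of item IDs (strings).
    """
    return [min(item_hash(seed, item) for item in items) for seed in range(k)]


def estimate_jaccard(fp_a, fp_b):
    """Fraction of matching fingerprint entries, an estimate of |A n B| / |A u B|."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    matches = sum(a == b for a, b in zip(fp_a, fp_b))
    return matches / len(fp_a)


if __name__ == "__main__":
    # Hypothetical users: each set holds the IDs of the items the user rated.
    alice = {f"movie-{i}" for i in range(0, 100)}
    bob = {f"movie-{i}" for i in range(50, 150)}
    estimate = estimate_jaccard(fingerprint(alice), fingerprint(bob))
    print(f"estimated overlap: {estimate:.3f} (true value is 1/3)")
```

Larger values of `k` reduce the variance of the estimate at the cost of a longer fingerprint, which is the kind of trade-off between fingerprint size and recommendation quality that the abstract refers to.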


Similar resources

A NOVEL FUZZY-BASED SIMILARITY MEASURE FOR COLLABORATIVE FILTERING TO ALLEVIATE THE SPARSITY PROBLEM

Memory-based collaborative filtering is the most popular approach to building recommender systems. Despite its success in many applications, it still suffers from several major limitations, including data sparsity. Sparse data affect the quality of the user similarity measurement and consequently the quality of the recommender system. In this paper, we propose a novel user similarity measure based...


A Hybrid Recommender System Using Trust and Bi-Clustering to Improve the Performance of Collaborative Filtering

In the present era, the amount of information grows exponentially, so finding the required information among the mass of information has become a major challenge. The success of e-commerce systems and online business transactions depends greatly on the effective design of product recommender mechanisms. Providing high quality recommendations is important for e-commerce systems to assist users i...


Effect of Rating Time for Cold Start Problem in Collaborative Filtering

Cold start is one of the main challenges in recommender systems. Solving the sparsity challenge of cold start users is hard. Most cold start users and items are new. Since many general methods for recommender systems overfit on cold start users and items, recommending to new users and items is an important and hard task. In this work, to overcome the sparsity problem, we present a new method for rec...


A Novel Trust Computation Method Based on User Ratings to Improve the Recommendation

Today, trust has become one of the most beneficial solutions for improving recommender systems, especially in the collaborative filtering method. However, trust statements suffer from a number of shortcomings, including the sparsity of trust statements, users' inability to express explicit trust for other users in most of the existing applications, etc. Thus to overcome these problems, this ...


Deriving Private Information from Randomly Perturbed Ratings

Collaborative filtering techniques have become popular in the past several years as an effective way to help people deal with information overload. An important security concern in traditional recommendation systems is that users disclose information that may compromise their individual privacy when providing ratings. Randomized perturbation schemes have been proposed to disguise user ratings w...


Publication date: 2010